perm filename A27.TEX[154,RWF] blob
sn#807852 filedate 1985-10-10 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00002 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 \magnification\magstephalf
C00011 ENDMK
C⊗;
\magnification\magstephalf
\input macro.tex
\def\today{\ifcase\month\or
January\or February\or March\or April\or May\or June\or
July\or August\or September\or October\or November\or December\fi
\space\number\day, \number\year}
\baselineskip 14pt
\rm
\line{\sevenrm a27.tex[154,phy] \today\hfill}
\bigskip
{\baselineskip 18pt
A {\it context-free grammar\/} (CFG) $G$ is a system for defining a
language over~$\Sigma↓G$. It uses a finite alphabet $V↓G$ which is
an extension of~$\Sigma↓G$; we define $N↓G=V↓G-\Sigma↓G$. It has a
{\it root symbol\/} $S↓G\in N↓G$. It has a finite set $P↓G$ of productions,
where a production is an element of $N\times V↑{\ast}$. We write
$A\tog x$ to indicate that $\langle A,x\rangle$ is a production;
we write $A\tog x↓1|x↓2\ldots|x↓n$ to indicate that
$\langle A,x↓1\rangle$, $\langle A,x↓2\rangle$, $\ldots$, $\langle A,x↓n\rangle$
are all productions. Thus $\tog$ is a relation from $N$ to~$V↑{\ast}$.
We extend it to a relation $\Rag$ from $V↑{\ast}$ to~$V↑{\ast}$,
where if $A\tog x$, $uAv\Rag uxv$. The relation
$\aRag$ is the reflexive transitive closure
of~$\Rag$. The language $L↓G$ defined by $G$ is
$\{\,x\mid S\aRag x\in\Sigma↑{\ast}\,\}$. Where
only one CFG is being discussed, we omit the subscript~$G$ and write
$V$, $\Sigma$, $P$, $S$, $→$, $\Rightarrow$, $\aRa$
and~$L$. A~language definable by a CFG we call a {\it context-free language\/}
(CFL).
}
An FAL can be defined by a CFG. Let $S=q↓0$, $N=Q$. Let the productions
be $q→aq'$, where $\delta(q,a)=q'$, and $q→a$ where $\delta(q,a)=q'\in F$.
Show by mathematical induction that if $q↓0\aRa x$,
either $\delta(q↓0,x)\in F$ or $x$ is $uq$ such that $\delta(q↓0,u)=q$.
Conversely, if a CFG $G$ is right linear (all productions belong to
$V\times(\Sigma↑{\ast}V∪\Sigma↑{\ast})$), then $L↓G$ is an~FAL.
\proclaim
Theorem.
The image of a context-free language under a GSM mapping is a context-free
language. (In the language of the Spring~1984
midterm exam, if $L$ is a CFL and $M$
a~gsm, $L\circ IO↓M$ is a~CFL.) Because this is twice as important as
the average theorem, we give two proofs.
\par
\noindent
{\bf First Proof.}
Let $M↓L$ be a PDT which generates $L$. Define the composition of the PDT
and a~GSM called~$M$ analogously to the composition of two GSMs; the result
is another PDT, with $\epsilon$ input, so it is a PDT generator, and its
input-output relation is the composition
$(\{\epsilon\}\times L)\circ IO↓M=\{\epsilon\}\times (L\circ IO↓M)$,
so it generates $L\circ IO↓M$.
\smallskip\noindent
{\bf Second Proof.}
Given the gsm $M$, with start and accepting states $S↓M$ and $F↓M$, and
transitions of the form $\langle q↓i,x,y,q↓j\rangle$, and a~CNF
grammar~$G$ for the CFL. Construct a grammar~$G'$, as follows:
\smallskip\disleft 25pt:(1):
$S↓{G'}→S↓{ij}$ for each $q↓i\in S↓M$, $q↓j\in F↓M$, where $S$ is the
root symbol of~$G$.
\smallskip\disleft 25pt:(2):
$A↓{ik}→B↓{ij}C↓{jk}$ for each $q↓i$, $q↓j$, and $q↓k\in Q↓M$, and for
each production of the form $A→BC$ in~$G$.
\smallskip\disleft 25pt:(3):
$A↓{ij}→R↓{tij}$ for each $q↓i,q↓j\in Q↓M$, and for each production
of the form $A→t$ in~$G$.
\smallskip\noindent
We need additional productions to derive $R↓{tij}\aRa y$ iff $y$ is
a string which is printed on a path from $q↓i$ to $q↓j$ that reads~$t$.
\smallskip\disleft 25pt:(4):
$R↓{tij}→yR↓{tkj}$ if $\langle q↓i,\epsilon,y,q↓k\rangle$.
\smallskip\disleft 25pt:(5):
$R↓{tij}→yR↓{\epsilon kj}$ if $\langle q↓i,t,y,q↓k\rangle$.
\smallskip\disleft 25pt:(6):
$R↓{\epsilon ij}→yR↓{\epsilon kj}$ if $\langle q↓i,\epsilon,y,q↓k\rangle$.
\smallskip\disleft 25pt:(6):
$R↓{\epsilon ii}→\epsilon$.
\smallskip\noindent
{\bf Exercise.} Show that if $L$ has an unambiguous grammar and $M$ is
deterministic, there is an unambiguous grammar for $L\circ IO↓M$.
The set of strings mapped into a member of $L$ by a~GSM is also
a~CFL. In fact, $IO↓M\circ L=L\circ IO↓{M↑{-1}}$, so no detailed
construction is needed.
\smallskip\noindent
{\bf Corollaries:} CFL's are closed under direct and inverse substitution
and homorphism, and under union and intersection with a regular set.
\proclaim
%\smallskip\noindent
Theorem. The computations of a PDT are a CFL.
\par
\noindent
{\bf Proof.} The set of paths through the flowchart is a regular set,
by the path theorem. Such a path is a computation iff the stack
operations are consistent. Let $L↓1$ be the context-free language
for stack behaviour, and $L↓2$ the same language padded in all possible
ways with non-stack operations. The intersection of the regular set
with~$L↓2$ is the set of computations of the~PDT.
\proclaim
%\smallskip\noindent
Corollary. The domain and range of a PDT are CFLs.
\par
\smallskip\noindent
{\bf Proof.} Apply the obvious substitutions to the language of computations.
\bigskip
\parindent0pt
\copyright 1984 Robert W. Floyd
First draft October 23, 1984
\bye